Introduction to modelsummary package
1 Before Starting
Make sure “Render on Save” is turned on. This lets you see changes to the Quarto document you are working on without having to re-render the output file yourself every time you save the file (Cmd+S for Mac users, Ctrl+S for Windows users).
1.1 Learning Objectives
- By the end of this section, you will know how to use the modelsummary package to create publication-quality regression and summary tables.
1.2 Data
Code
library(AER)          # provides the CPS1988 data
library(data.table)   # provides setDT(), copy(), and :=
library(magrittr)     # provides the %>% pipe used below
library(modelsummary)
data(CPS1988)
# I prefer to convert the data to data.table.
setDT(CPS1988)
2 Introduction to modelsummary package
2.1 Introduction
The modelsummary package lets you create clean summary tables that report the descriptive statistics of the data and the regression results.
Today, we mainly use two functions in the modelsummary package:
- datasummary(): to create a summary table for the descriptive statistics of the data.
- modelsummary(): to create a summary table for the regression results.
Check the documentation for more details.
Note
- There is another package called stargazer that can create a summary table, but it is not maintained anymore. So, I recommend using the modelsummary package.
- The modelsummary package is compatible with
2.2 A Taste of the modelsummary package
Below, I show what kinds of tables can be created with the modelsummary package. Don’t try to understand the code for now. Just look at the output tables.
Code
datasummary(
  wage + education + experience ~ Mean + SD + Min + Max,
  data = CPS1988
)

| | Mean | SD | Min | Max |
|---|---|---|---|---|
| wage | 603.73 | 453.55 | 50.05 | 18777.20 |
| education | 13.07 | 2.90 | 0.00 | 18.00 |
| experience | 18.20 | 13.08 | -4.00 | 63.00 |
Code
# change the base group for ethnicity to "cauc"
ex_dt <-
  copy(CPS1988) %>%
  .[, ethnicity := relevel(as.factor(ethnicity), ref = "cauc")]

ls_regs <-
  list(
    "OLS 1" = lm(log(wage) ~ education, data = ex_dt),
    "OLS 2" = lm(log(wage) ~ education + experience + I(experience^2), data = ex_dt),
    "OLS 3" = lm(log(wage) ~ education + experience + I(experience^2) + ethnicity, data = ex_dt)
  )

modelsummary(
  models = ls_regs,
  output = "latex",
  coef_map = c(
    "education" = "Education",
    "experience" = "Experience",
    "I(experience^2)" = "Experience squared",
    # with "cauc" as the base group, this coefficient belongs to the "afam" group
    "ethnicityafam" = "African American"
  ),
  stars = c("*" = .05, "**" = .01, "***" = .001),
  gof_map = c("nobs", "r.squared", "adj.r.squared"),
  notes = list("Std. Errors in parentheses")
)

| | OLS 1 | OLS 2 | OLS 3 |
|---|---|---|---|
| Education | 0.076*** | 0.087*** | 0.086*** |
| | (0.001) | (0.001) | (0.001) |
| Experience | | 0.078*** | 0.077*** |
| | | (0.001) | (0.001) |
| Experience squared | | -0.001*** | -0.001*** |
| | | (0.000) | (0.000) |
| African American | | | -0.243*** |
| | | | (0.013) |
| Num.Obs. | 28155 | 28155 | 28155 |
| R2 | 0.095 | 0.326 | 0.335 |
| R2 Adj. | 0.095 | 0.326 | 0.335 |

* p < 0.05, ** p < 0.01, *** p < 0.001
Std. Errors in parentheses
2.3 Regression Tables with modelsummary() function
Let’s start with the modelsummary() function to create a summary table for the regression results.
2.4 modelsummary() function
2.4.1 Basics
Example
reg1 <- lm(log(wage) ~ education, data = CPS1988)
reg2 <- lm(log(wage) ~ education + experience + I(experience^2), data = CPS1988)

modelsummary(models = list(reg1, reg2))

| | (1) | (2) |
|---|---|---|
| (Intercept) | 5.178 | 4.278 |
| | (0.019) | (0.019) |
| education | 0.076 | 0.087 |
| | (0.001) | (0.001) |
| experience | | 0.078 |
| | | (0.001) |
| I(experience^2) | | -0.001 |
| | | (0.000) |
| Num.Obs. | 28155 | 28155 |
| R2 | 0.095 | 0.326 |
| R2 Adj. | 0.095 | 0.326 |
| AIC | 405753.0 | 397432.7 |
| BIC | 405777.7 | 397473.9 |
| Log.Lik. | -29139.853 | -24977.715 |
| F | 2941.787 | 4545.929 |
| RMSE | 0.68 | 0.59 |
modelsummary(
  models = list("OLS 1" = reg1, "OLS 2" = reg2),
  coef_map = c(
    "education" = "Education",
    "experience" = "Experience",
    "I(experience^2)" = "Experience squared"
  ),
  stars = c("*" = .05, "**" = .01, "***" = .001)
  # coef_omit = 1
)

| | OLS 1 | OLS 2 |
|---|---|---|
| Education | 0.076*** | 0.087*** |
| | (0.001) | (0.001) |
| Experience | | 0.078*** |
| | | (0.001) |
| Experience squared | | -0.001*** |
| | | (0.000) |
| Num.Obs. | 28155 | 28155 |
| R2 | 0.095 | 0.326 |
| R2 Adj. | 0.095 | 0.326 |
| AIC | 405753.0 | 397432.7 |
| BIC | 405777.7 | 397473.9 |
| Log.Lik. | -29139.853 | -24977.715 |
| F | 2941.787 | 4545.929 |
| RMSE | 0.68 | 0.59 |

* p < 0.05, ** p < 0.01, *** p < 0.001
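When coef_map is not used, the intercept and the full set of goodness-of-fit rows appear in the table by default. The following is a minimal sketch, not necessarily the setup used in this lecture, of trimming both: coef_omit takes a regular expression that drops matching terms, and gof_map keeps only the listed statistics. It assumes the AER package is installed so that CPS1988 can be loaded, and it sets output = "data.frame" only so the result can be inspected as an ordinary data frame.

```r
library(modelsummary)

# CPS1988 ships with the AER package (assumed to be installed)
data("CPS1988", package = "AER")

reg1 <- lm(log(wage) ~ education, data = CPS1988)
reg2 <- lm(log(wage) ~ education + experience + I(experience^2), data = CPS1988)

trimmed <- modelsummary(
  models = list("OLS 1" = reg1, "OLS 2" = reg2),
  coef_omit = "Intercept",           # regular expression: drop matching terms
  gof_map = c("nobs", "r.squared"),  # keep only these goodness-of-fit rows
  output = "data.frame"              # return a data frame instead of a rendered table
)

print(trimmed)
```

Dropping the output argument gives the usual rendered table with the same rows.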
# msummary() is a shortcut for modelsummary() and accepts the same arguments
msummary(
  models = list("OLS 1" = reg1, "OLS 2" = reg2),
  coef_map = c(
    "education" = "Education",
    "experience" = "Experience",
    "I(experience^2)" = "Experience squared"
  ),
  stars = c("*" = .05, "**" = .01, "***" = .001)
)

| | OLS 1 | OLS 2 |
|---|---|---|
| Education | 0.076*** | 0.087*** |
| | (0.001) | (0.001) |
| Experience | | 0.078*** |
| | | (0.001) |
| Experience squared | | -0.001*** |
| | | (0.000) |
| Num.Obs. | 28155 | 28155 |
| R2 | 0.095 | 0.326 |
| R2 Adj. | 0.095 | 0.326 |
| AIC | 405753.0 | 397432.7 |
| BIC | 405777.7 | 397473.9 |
| Log.Lik. | -29139.853 | -24977.715 |
| F | 2941.787 | 4545.929 |
| RMSE | 0.68 | 0.59 |

* p < 0.05, ** p < 0.01, *** p < 0.001
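The msummary() table matches the modelsummary() table because the two names refer to the same function. As a quick sanity check, the sketch below compares the two calls on a single model. It assumes the AER package is installed to supply CPS1988; the object names tab_full, tab_short, and same_table are made up for this illustration.

```r
library(modelsummary)

# CPS1988 ships with the AER package (assumed to be installed)
data("CPS1988", package = "AER")

reg1 <- lm(log(wage) ~ education, data = CPS1988)

# output = "data.frame" returns plain data frames that can be compared directly
tab_full  <- modelsummary(list("OLS 1" = reg1), output = "data.frame")
tab_short <- msummary(list("OLS 1" = reg1), output = "data.frame")

# same_table is TRUE when both calls return the same table
same_table <- isTRUE(all.equal(tab_full, tab_short))
print(same_table)
```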